Annotation-based Document Retrieval with Four-Valued Probabilistic Datalog

Authors

  • Ingo Frommholz
  • Ulrich Thiel
  • Thomas Kamps
Abstract

The COLLATE system (collaboratory for annotation, indexing and retrieval of digitized historical archive material) provides film researchers with a collaborative environment in which historic documents about European films can be analysed, interpreted and discussed, using nested annotations and discourse structure relations among them. Annotations are metadata, and annotation threads form a hypertext containing positive and negative links, constituting a kind of context that can be exploited for document retrieval. In this paper, we discuss a solution for using annotations in information retrieval. To exploit annotation threads, which consist of nested annotations and typed links between them, an annotation-based retrieval approach has to cope with negative and contradictory statements. The nested annotation retrieval approach (NARA) addresses these issues. Building on NARA, we present NARAlog, an implementation using four-valued probabilistic datalog (FVPD) that can perform an in-depth analysis of annotation threads and deal with contradictory statements.
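As a rough illustration of why four truth values help with contradictory annotations, the following Python sketch combines positive and negative evidence about one statement into probabilities for "true", "false", "inconsistent" and "unknown". It is not NARAlog or FVPD syntax: the names `FourValued` and `combine_evidence` are hypothetical, and the assumption that positive and negative evidence are independent is made here only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FourValued:
    """Probabilities of the four truth values for one statement."""
    true: float
    false: float
    inconsistent: float
    unknown: float


def combine_evidence(p_pos: float, p_neg: float) -> FourValued:
    """Combine positive and negative evidence for a statement.

    Assumption (illustrative only): the event "there is positive evidence"
    and the event "there is negative evidence" are independent.
    """
    for p in (p_pos, p_neg):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
    return FourValued(
        true=p_pos * (1.0 - p_neg),             # only positive evidence
        false=(1.0 - p_pos) * p_neg,            # only negative evidence
        inconsistent=p_pos * p_neg,             # both kinds of evidence
        unknown=(1.0 - p_pos) * (1.0 - p_neg),  # no evidence at all
    )


# A document supported by one annotation (0.8) and contradicted by another (0.5).
result = combine_evidence(0.8, 0.5)
print(result)
```

Under these assumptions the four probabilities always sum to one, so a contradiction between annotations is kept as explicit "inconsistent" mass rather than being forced into true or false.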


Similar resources

Annotation-based Information Retrieval

One of the main reasons we use paper is that annotating it is easier. Moreover, paper that has already been annotated helps us find the important parts of a document. By adding a remark or note to part of a document, an explanation can be extended beyond the topic itself. This was the motivation for creating annotation-based retrieval systems. They are...


Annotation-Based Document Retrieval with Probabilistic Logics

Annotations are an important part in today’s digital libraries and Web information systems as an instrument for interactive knowledge creation. Annotation-based document retrieval aims at exploiting annotations as a rich source of evidence for document search. The POLAR framework supports annotation-based document search by translating POLAR programs into four-valued probabilistic datalog and a...


Information Retrieval Methods for Multimedia Objects

We describe five major concepts that are essential for multimedia retrieval: uncertain inference addresses vagueness of queries and imprecision of content representations. Predicate logic allows for dealing with spatial and temporal relationships. The document structure has to be considered in order to retrieve the most relevant part of a document in response to a query. Whereas fact retrieval ...


HySpirit - A Probabilistic Inference Engine for Hypermedia Retrieval in Large Databases

HySpirit is a retrieval engine for hypermedia retrieval integrating concepts from information retrieval (IR) and deductive databases. The logical view on IR models retrieval as uncertain inference, for which we use probabilistic reasoning. Since the expressiveness of classical IR models is not sufficient for hypermedia retrieval, HySpirit is based on a probabilistic version of Datalog. In hyperme...


PIRE: An Extensible IR Engine Based on Probabilistic Datalog

This paper introduces PIRE, a probabilistic IR engine. For both document indexing and retrieval, PIRE makes heavy use of probabilistic Datalog, a probabilistic extension of predicate Horn logics. Using such a logical framework together with probability theory allows for defining and using data types (e.g. text, names, numbers), different weighting schemes (e.g. normalised tf, tf.idf or BM25) an...




Publication date: 2004